Patent abstract:
Method for combining images relating to three-dimensional content. The present invention relates to a method for superimposing images on three-dimensional content, wherein a video stream is received which comprises the three-dimensional content and a depth map for superimposing images on that content. Once the video stream has been received, the images are superimposed on the three-dimensional content at a depth position dependent on the overlay depth map (DM). The overlay depth map contains information about the depth of the three-dimensional content and is inserted as an image contained in a frame (C) of the video stream. The depth map has a smaller number of pixels than a two-dimensional image associated with the three-dimensional content. The invention also relates to devices enabling the implementation of said methods.
Publication number: BR112013001910A2
Application number: R112013001910
Filing date: 2011-07-28
Publication date: 2019-09-17
Inventors: Giovanni Ballocca; Paolo D'amato; Dario Pennisi
Applicants: 3Dswitch S R L; Sisvel Tech S R L
IPC main classification:
Patent description:

METHOD FOR COMBINING IMAGES RELATING TO THREE-DIMENSIONAL CONTENT
DESCRIPTION
TECHNICAL FIELD The present invention relates to methods and devices for combining, within a device for stereoscopic display, locally generated images superimposed on a three-dimensional content received by the device itself.
PRIOR ART It is known that television equipment (television sets and decoders) can locally generate images containing text and graphics and superimpose them on the images being received; it is thus possible to provide the user with useful information of various kinds while the video is displayed in the background.
These images can be generated from information received with the video signal (as is the case, for example, with subtitles and some electronic programme guides, also known as EPGs), or they can provide information about the configuration and settings of the decoder or television set (e.g., menus or the bar indicating the volume level and other parameters).
Nowadays, the amount of 3D content available on the market has grown considerably; the enjoyment of this content is no longer limited to cinemas, and users can watch 3D videos at home on their own television sets.
Therefore, also for 3D videos there is a need to superimpose locally generated images on the television images being received.
Compared to a 2D video stream, superimposing images on a 3D video stream is more complex, since it is necessary to take into account the different depths at which the objects contained in a single frame of the video stream are arranged.
Patent application EP2157803A1 teaches how to position a text so that it always remains in front of the television image. In particular, if the 3D content is broadcast as a two-dimensional image plus a depth matrix, the latter can also be used to define the position of the text.
This solution has the drawback that it uses a large depth map, since the map serves to create the pair of right and left images (which, when combined, produce the 3D effect) starting from a basic two-dimensional image. Besides requiring considerable computational effort when the map is analyzed to define the position of the text, the size of that map also involves a high use of bandwidth when the map is transmitted to a receiver.
Patent application WO2008/038205 describes a method for composing 3D images (to be shown on a 3D display) made up of main 3D images and further 2D or 3D images, such as text or graphics, to be superimposed on the main ones. All types of images consist of a 2D image and a related depth map; the depth maps of the various image types are combined, together with the corresponding 2D images, for the reconstruction of the image to be displayed.
Patent application EP-1705929 describes a method for organizing the transmission of 3D images by combining in one frame one or two 2D images composing the stereoscopic image, together with depth information useful for the reconstruction of the 3D image at the receiving end.
OBJECTIVES AND BRIEF DESCRIPTION OF THE INVENTION The objective of the present invention is to provide a method and a system for combining images with a three-dimensional content carried by a stereoscopic video stream, which allow the drawbacks of the prior art to be overcome.
In particular, it is an objective of the present invention to provide a method for superimposing images on a 3D content which requires a lower computational cost on the part of the 3D content player.
It is another objective of the present invention to provide a method for transmitting the information necessary for superimposing images on those carried by a stereoscopic video stream which does not require high bandwidth usage and which is robust with respect to the encoding and decoding of the stereoscopic video stream.
These and other objectives of the present invention are achieved by means of a method and a system for superimposing images on those carried by stereoscopic video stream that incorporate the characteristics explained in the attached claims, which are intended to be an integral part of the present description.
The general idea at the basis of the present invention is to display an element superimposed on a stereoscopic stream by using, in the playback phase, an overlay depth map encoded as an image contained in a frame of the stereoscopic stream. The depth map used in the present invention is not intended for encoding the stereoscopic video stream, since it is transmitted for the sole purpose of providing the decoder or television set with information useful for superimposing locally generated images on the stereoscopic image in an appropriate manner. For this purpose, the depth map has a low resolution and, therefore, a smaller number of pixels than the stereoscopic pair, thus limiting the bandwidth occupation. This is possible because the map is not used for the generation of the three-dimensional image, but only for the proper positioning of the overlays.
In a preferred embodiment, the frame carries the composite image, which comprises a right image, a left image and the depth map, suitably multiplexed.
In one embodiment, the right and left images are arranged according to a traditional format, e.g., side-by-side, top-and-bottom or checkerboard, while the depth map is inserted into a free area of the composite frame not intended for display.
In an alternative embodiment, the right and left images are arranged in an innovative frame format. In this embodiment, the frame comprises a number of pixels greater than the sum of the pixels of the original format (that is, prior to encoding) of the right and left images, which are thus inserted without being subjected to decimation.
In this embodiment, the pixels of the first image (e.g., the left image) are inserted into the composite image without undergoing changes, whereas the second image is subdivided into regions whose pixels are arranged in free areas of the composite image.
This solution offers the advantage that one of the images is left unchanged, which results in a better quality of the reconstructed image.
Advantageously, the second image is then divided into the fewest possible regions, in order to maximize the spatial correlation between the pixels and reduce the generation of errors during the compression step.
In an advantageous embodiment, the regions of the second image are inserted into the composite image only by means of translation or rototranslation operations, thus leaving the ratio between horizontal and vertical resolution unchanged.
In another embodiment, at least one of the regions into which the second image has been divided undergoes a specular inversion step, that is, it is flipped with respect to an axis (in particular, a side) and arranged in the composite image in such a way that one of its sides borders a side of the other image having identical or similar pixels on the bordering side, owing to the strong correlation between homologous pixels of the right and left images, that is, pixels of the two images positioned in the same row and column.
This solution offers the advantage of reducing the generation of errors in the border area. More advantageously, the regions into which the second image is subdivided are rectangular in shape; compared with a solution using triangular regions whose border areas cross the composite image along diagonal directions, this choice reduces the errors produced by a subsequent compression step, especially if the latter acts on square blocks of pixels (e.g., 16x16 for the H.264 standard).
According to a particularly advantageous embodiment, the formation of errors is further reduced, or even completely eliminated, by introducing redundancy into the composite image, that is, by copying some groups of pixels several times. In particular, this is achieved by breaking the basic image to be inserted into the composite image into regions whose total number of pixels exceeds the number of pixels of the image to be divided. In other words, the image is divided into regions of which at least two comprise a common image part. The common image part is a border area between regions adjacent to each other in the disassembled image. The size of this common part preferably depends on the type of compression to be subsequently applied to the composite image, and it can act as a buffer area that will be partially or completely removed when the disassembled image is reconstructed. Since compression may introduce errors in the areas bordering those regions, by eliminating the buffer areas, or at least their outermost part, it is possible to eliminate any errors and reconstruct an image faithful to the original one.
Other objectives and advantages of the present invention will become more apparent from the following description of some embodiments thereof, which are provided by way of non-limiting examples.
BRIEF DESCRIPTION OF THE DRAWINGS Said embodiments will be described with reference to the accompanying drawings, in which: Fig. 1 shows a block diagram of a device for multiplexing the right image and the left image into a composite image;
Fig. 2 is a flow chart of a method performed by the device of Fig. 1;
Fig. 3 shows a first way of disassembling an image to be inserted in the composite image;
Fig. 4 shows a first step in the construction of a composite image according to an embodiment of the present invention;
Fig. 5 shows a complete composite image of Fig. 4;
Fig. 6 shows a second way of disassembling an image to be inserted into a composite image;
Fig. 7 shows a composite image that includes the image of Fig. 6;
Fig. 8 shows a third way of disassembling an image to be inserted into a composite image;
Fig. 9 shows a composite image that includes the image of Fig. 8;
Fig. 10 shows a block diagram of a receiver for receiving a composite image generated according to the method of the present invention;
Fig. 11 shows some steps of reconstructing the image disassembled according to the method of Fig. 8 and inserted into the composite image received by the receiver of Fig. 10;
Fig. 12 is a flow chart of a method for reconstructing the multiplexed right and left images into a composite image of the type shown in Fig. 9;
Fig. 13 shows a composite image according to a fourth embodiment of the present invention;
Figs. 14a to 14f show a right image and a left image in the different processing steps carried out to insert them in the composite image of Fig. 13.
Where appropriate, similar structures, components, materials and/or elements are indicated by means of similar references in the different figures.
DETAILED DESCRIPTION OF THE INVENTION Fig. 1 shows a block diagram of a device 100 for generating a stereoscopic video stream 101 with a depth map for superimposing images on a video content carried by the video stream.
For the purposes of the present invention, a three-dimensional (or 3D) content is an image or a video that is perceived by the observer as having variable depth, in which elements can protrude from the plane of the screen on which said image or video is being displayed or projected.
The expression “to superimpose two images” refers to any form of combination of two images, for example, in transparency, semi-transparency or complete opacity.
The present invention also applies to any type of overlay, whether static or dynamic, that is, having graphic characteristics that are fixed or variable over time, and which in turn can be either two-dimensional or three-dimensional.
The depth of a three-dimensional content refers to the dimension of the three-dimensional content that extends into the screen along an axis orthogonal to the screen on which the 3D content is being displayed. For the purposes of this description, the screen corresponds to a point of zero depth, while the “minimum depth” point is the point of the 3D content that is perceived by the user as closest to himself, that is, farthest from the screen. Accordingly, the “maximum depth” point is the point perceived by the observer as deepest into the screen, that is, farthest from the observer, even beyond the plane of the screen.
In Fig. 1, device 100 receives two image sequences 102 and 103, e.g., two video streams, respectively intended for the left eye (L) and the right eye (R), and a sequence of depth maps 106. Each depth map in sequence 106 is associated with one pair of right and left images belonging to sequences 102 and 103, respectively. In this embodiment, the depth map is generated by algorithms known per se, which compare a right image with a left image and return a matrix (the depth map) having a size equal to the pixels of one of the two compared images and whose elements have a value proportional to the depth of each displayed pixel. Another depth map generation technique is based on measuring the distance of the objects in the scene from the pair of video cameras shooting the scene: this distance can be easily measured by means of a laser. In the case of artificial video streams generated with the help of electronic computers, the video cameras are virtual ones, in that they consist of two points of view of a given scene artificially created by a computer. In another embodiment, one depth map is associated with multiple pairs of right and left images; in this case, the value chosen for each element of the depth map is the minimum depth value of the pixel in the different frames. Preferably, in this embodiment, the depth map is inserted once per group of frames with which it is associated, in order to reduce the load on device 100, into which a piece of information is also inserted that allows a depth map to be associated with multiple pairs of right and left images.
As an alternative to the example of Fig. 1, the depth maps of sequence 106 can be generated within device 100. In that case, device 100 comprises a suitable module that receives the L and R images of sequences 102 and 103 and then generates the corresponding depth maps.
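The invention relies on algorithms known per se for this comparison; purely as an illustration, the following sketch (Python with numpy, the language used for all examples below) computes a coarse map by block matching with the sum of absolute differences (SAD). Both the algorithm and the scaling of the disparity into 8-bit depth values are assumptions of the example, not requirements of the invention.

```python
import numpy as np

def depth_map_sad(left, right, block=8, max_disp=32):
    """Coarse depth map from a stereo pair via SAD block matching.

    left, right: 2D grayscale arrays of equal shape (e.g., 720x1280).
    Returns a map of the same shape; larger values are assumed here to
    mean larger disparity, i.e., points closer to the viewer.
    """
    h, w = left.shape
    depth = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ref = right[y:y + block, x:x + block].astype(np.int32)
            best_sad, best_d = None, 0
            for d in range(min(max_disp, w - block - x) + 1):
                cand = left[y:y + block, x + d:x + d + block].astype(np.int32)
                sad = int(np.abs(ref - cand).sum())
                if best_sad is None or sad < best_sad:
                    best_sad, best_d = sad, d
            # Scale the winning disparity into an 8-bit value (assumed mapping).
            depth[y:y + block, x:x + block] = best_d * 255 // max_disp
    return depth
```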
Device 100 makes it possible to implement a method for multiplexing two images of the two sequences 102 and 103 and the depth map of sequence 106.
In order to implement the method for multiplexing the right and left images and the depth map, device 100 comprises a disassembly module 104 for breaking up an input image (the right image in the example of Fig. 1) into a plurality of sub-images, each corresponding to one region of the received image, a subsampling and filtering module 107 for processing the depth map, and an assembly module 105 capable of inserting the pixels of the received images, including the depth map, into a single composite image to be provided at its output. If no processing of sequence 106 is required, module 107 can be omitted. This may be the case, for example, when the depth map is laser-generated and has, from the start, a lower resolution than that of the L and R images.
An example of a multiplexing method implemented by device 100 will now be described with reference to Fig. 2.
The method starts at step 200. Subsequently (step 201), one of the two input images (right or left) is divided into a plurality of regions, as shown in Fig. 3. In the example of Fig. 3, the disassembled image is a frame R of a 720p video stream, that is, a progressive format with a resolution of 1280x720 pixels at 25/30 fps (frames per second).
The frame R of Fig. 3 comes from the video stream 103 carrying the images intended for the right eye, and is disassembled into three regions R1, R2 and R3.
Disassembly of the R image is achieved by dividing it into two parts of the same size and then subdividing one of these parts into two parts of the same size.
The R1 region has a size of 640x720 pixels and is obtained by taking the first 640 pixels of every row. The R2 region has a size of 640x360 pixels and is obtained by taking the pixels from 641 to 1280 of the first 360 rows. The R3 region has a size of 640x360 pixels and is obtained by taking the remaining pixels of the image R, that is, the pixels from 641 to 1280 of the last 360 rows.
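In array terms, the disassembly just described amounts to three slices of the frame. A minimal sketch, assuming the frame R is held as a numpy array indexed as rows x columns (color channels omitted for brevity):

```python
import numpy as np

R = np.zeros((720, 1280), dtype=np.uint8)  # frame R of the 720p stream

R1 = R[:, :640]      # 640x720 region: first 640 pixels of every row
R2 = R[:360, 640:]   # 640x360 region: pixels 641-1280 of the first 360 rows
R3 = R[360:, 640:]   # 640x360 region: pixels 641-1280 of the last 360 rows
```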
In the example of Fig. 1, the step of disassembling the image R is performed by module 104, which receives an input image (in this case, the frame R) and outputs three sub-images (that is, three groups of pixels) corresponding to the three regions R1, R2 and R3.
Subsequently (steps 202, 203 and 205) the composite image C is constructed, which comprises the information belonging to both the right and left images and the received depth map; in the example described herein, said composite image C is a frame of the outgoing stereoscopic video stream and therefore is also called a container frame.
First of all (step 202), the input image received by device 100 and not disassembled by module 104 (the left image L in the example of Fig. 1) is inserted unchanged into a container frame that is sized to include all the pixels of both input images. For example, if the input images have a size of 1280x720 pixels, then a container frame suitable for containing both will be a frame of 1920x1080 pixels, e.g., a frame of a 1080p video stream (progressive format with 1920x1080 pixels, 25/30 frames per second).
In the example in Fig. 4, the left image L is inserted into the container frame C and positioned in the upper left corner. This is achieved by copying the 1280x720 pixels of the L image to an area C1 that consists of the first 1280 pixels of the first 720 rows of the C container frame.
When in the following description reference is made to inserting an image in a frame, or transferring or copying pixels from one frame to another, it is understood that this means performing a procedure that generates (using hardware and / or software) a new frame comprising the same pixels as the source image.
The (software and / or hardware) techniques for reproducing a source image (or a group of pixels from a source image) in a target image are considered to be unimportant for the purposes of the present invention and will not be discussed further here, since they are known to those skilled in the art.
In the next step 203, the image disassembled in step 201 by module 104 is inserted into the container frame. This is done by module 105 by copying the pixels of the disassembled image into the areas of the container frame C that were not occupied by the image L, that is, areas external to area C1.
In order to obtain the best possible compression and reduce the generation of errors when decompressing the video stream, the pixels of the sub-images output by module 104 are copied while preserving the respective spatial relationships. In other words, the regions R1, R2 and R3 are copied into the respective areas of frame C without undergoing any deformation, exclusively by means of translation and/or rotation operations.
An example of the container frame C sent by module 105 is shown in Fig. 5.
The R1 region is copied into the last 640 pixels of the first 720 rows (area C2), that is, adjacent to the previously copied image L.
The R2 and R3 regions are copied under area C1, that is, respectively in areas C3 and C4, which respectively comprise the first 640 pixels and the following 640 pixels of the last 360 rows.
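Steps 202 and 203 thus reduce to plain pixel copies; a self-contained sketch of the layout of Fig. 5 (grayscale for brevity):

```python
import numpy as np

L = np.zeros((720, 1280), dtype=np.uint8)   # left image, kept unchanged
R = np.zeros((720, 1280), dtype=np.uint8)   # right image, disassembled below
R1, R2, R3 = R[:, :640], R[:360, 640:], R[360:, 640:]

C = np.zeros((1080, 1920), dtype=np.uint8)  # 1080p container frame
C[:720, :1280] = L       # area C1: left image, copied unchanged
C[:720, 1280:] = R1      # area C2: last 640 pixels of the first 720 rows
C[720:, :640] = R2       # area C3: first 640 pixels of the last 360 rows
C[720:, 640:1280] = R3   # area C4: following 640 pixels of the last 360 rows
```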
As an alternative to the solution shown in Fig. 5, the regions R2 and R3 can be copied into the container frame C in detached areas (that is, neither overlapping nor neighbouring) separated by a group of pixels, in order to reduce the border regions. The operations of inserting the L and R images into the container frame do not imply any alteration of the balance between horizontal and vertical resolution.
In the free pixels of frame C, that is, in area C5, module 105 inserts, in the form of an image, the depth map (DM) that belongs to the stereoscopic pair L and R (step 205). Before step 205, the DM depth map can be subsampled, filtered or further processed by module 107.
The depth map is preferably coded as a grayscale image, whose information content can therefore be carried by the luminance signal alone, the chrominance being null; this makes it possible to obtain an effective compression of the container frame C.
As shown in the example of Fig. 5, the depth map inserted into frame C is preferably an overlay depth map and, therefore, its resolution is lower than that of the pair L and R, since it is transmitted with the sole purpose of positioning the overlays in depth, not of generating the stereoscopic video stream. The chosen resolution of the depth map is the result of a compromise between the bit rate required for its transfer, which should be as low as possible, and the quality of the information necessary for the proper positioning of the overlays.
In a preferred embodiment, the overlay depth map DM has a resolution of 640x360 pixels, corresponding to a 4-to-1 subsampling (or decimation) of the original depth map, which has a resolution of 1280x720 pixels matching that of the images L and R. Each pixel of the subsampled map DM corresponds to a 2x2 pixel region of the original map. In particular, the 4-to-1 subsampling step can be performed by selecting one row out of two and one column out of two of the original map.
In another embodiment, after the decimation, the DM overlay depth map goes through a processing step in which it is divided into 16x16 pixel macroblocks, and the pixels that belong to the same macroblock receive a single depth value. Preferably, this value is equal to the minimum depth within the macroblock, since this is the most significant value for the proper positioning of the overlays.
Alternatively, this value is equal to the average depth value within the macroblock.
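The 4-to-1 decimation and the per-macroblock flattening just described can be sketched as follows; np.min implements the variant indicated as preferred (the minimum depth of the block), while np.mean would give the average-value variant:

```python
import numpy as np

def overlay_depth_map(depth, step=2, mb=16, reduce_fn=np.min):
    """Decimate a full-resolution depth map and flatten it macroblock-wise.

    depth: e.g., a 720x1280 map; step=2 keeps one row and one column out
    of two (4-to-1 decimation, giving 360x640); each mb x mb block then
    receives a single value (its minimum by default, i.e., the depth of
    the point closest to the viewer).
    """
    dm = depth[::step, ::step]
    out = dm.copy()
    for y in range(0, dm.shape[0], mb):
        for x in range(0, dm.shape[1], mb):
            out[y:y + mb, x:x + mb] = reduce_fn(dm[y:y + mb, x:x + mb])
    return out
```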
The choice of 16x16-pixel macroblocks is particularly advantageous when the compression standard in use is H.264, because such macroblocks coincide with those used by the H.264 standard. With this solution, in fact, compression generates fewer errors and requires a lower bit rate.
Subdivision into 8x8 or 4x4 blocks can also be considered advantageous, since, due to the particular characteristics of the H.264 compression algorithm, compression benefits are obtained if the pixels within these blocks are all equal.
As an alternative to subdividing into blocks or macroblocks within which the pixels are all equal, the 640x360 depth map can be filtered by a two-dimensional low-pass filter. Compression advantages are obtained in this case as well, because the highest spatial frequencies are eliminated or reduced.
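As a sketch of this low-pass alternative, a simple k x k moving-average (box) filter attenuating the highest spatial frequencies; the kernel size and type are assumptions, since the text does not specify the filter:

```python
import numpy as np

def lowpass_box(dm, k=5):
    """Filter a depth map with a k x k moving average (edge-padded)."""
    pad = k // 2
    padded = np.pad(dm.astype(np.float32), pad, mode='edge')
    out = np.zeros(dm.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + dm.shape[0], dx:dx + dm.shape[1]]
    return (out / (k * k)).astype(dm.dtype)
```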
Alternatively, the depth map can have a resolution of 160 x 90 pixels, resulting from a 64-to-1 subsampling, where each pixel of the DM depth map corresponds to an 8x8 region of the original map.
In another embodiment, the overlay depth map DM inserted into the container frame C may have a non-uniform resolution; in particular, the lower half or third of the overlay depth map has a higher resolution than the upper part. This solution proves particularly advantageous as concerns the positioning of subtitles or other information, such as the audio volume, which is usually placed in the lower part of the image. The receiver can thus use more accurate information about the depth of the pixels in a region of interest, e.g., the lower third of the 3D image, and can therefore position the images (text or graphics) correctly in that region. Finally, the overlay depth map may even contain only information about the depth of the pixels (all of them or only a part thereof) located in a region of interest, in particular in the lower half or lower third of the three-dimensional content.
In another embodiment, a region of the container frame that is occupied neither by the right or left images, nor by parts thereof, nor by the overlay depth map is intended to receive a marking that is necessary for reconstructing the right and left images at the demultiplexer level. For example, the marking may relate to the way the composite image was created. Preferably, the marking may contain information useful for properly using the depth map.
The pixels of this marking region are, for example, painted in two colors (e.g., black and white) so as to create a barcode of any kind, e.g., linear or two-dimensional, which carries the marking information.
When the transfer into the container frame of both images and of the received overlay depth map (and possibly of the marking as well) has been completed, the method implemented by device 100 ends, and the container frame can be compressed and transmitted over a communication channel and/or recorded onto a suitable medium (e.g., CD, DVD, Blu-ray, mass memory, etc.).
Since the multiplexing operations explained above do not alter the spatial relationships among the pixels of a region or image, the video stream output by device 100 can be compressed considerably while preserving a good chance that the image will be reconstructed very faithfully to the transmitted one without significant errors being created.
Before describing further embodiments, it should be pointed out that, in the preferred embodiment, the division of the frame R into three regions R1, R2 and R3 corresponds to dividing the frame into the smallest possible number of regions, taking into account the space available in the composite image and the space occupied by the left image inserted unchanged into the container frame. Said smallest number is, in other words, the minimum number of regions necessary to occupy the space left available in the container frame C by the left image.
In general, therefore, the minimum number of regions in which the image should be disassembled is defined as a function of the format of the source images (right and left images) and the composite target image (container frame C).
Preferably, the image to be inserted in the frame is disassembled taking into account the need to divide the image (eg, R in the example above) into the smallest number of rectangular regions.
In another embodiment, the right image R is disassembled as shown in Fig. 6.
The region R1' corresponds to the region R1 of Fig. 3 and therefore comprises the first 640 pixels of all 720 rows of the image.
The region R2' comprises the 320 pixel columns adjacent to the region R1', whereas the region R3' comprises the last 320 pixel columns.
The container frame C can thus be constructed as shown in Fig. 7, with the regions R2' and R3' rotated by 90° and arranged in the areas C3' and C4' under the image L and the region R1'.
The regions R2' and R3' thus rotated occupy 720 pixels over 320 rows; therefore, the areas C3' and C4' are separated from the areas C1 and C2 that contain the pixels copied from the image L and from the region R1'.
Preferably, the areas C3' and C4' are separated from the other areas C1 and C2 by at least one guard line. In particular, it is advantageous and preferable to copy the pixels of the regions R2' and R3' into the last rows of the container frame C.
Since in this case the container frame is composed of 1080 rows, in the embodiment of Fig. 7 the rotated regions R2' and R3' are separated from the image L above and from the region R1' by a guard strip 40 pixels high.
In the example of Fig. 7, the regions R2' and R3' are separated from each other, so as to be surrounded by pixels of a predefined color (e.g., white or black) not coming from the right and left images. In this way, the border areas between regions containing pixels coming from the right and left images are reduced, while also reducing any errors caused by image compression and maximizing the compression rate.
As an alternative to positioning R2' and R3' in the last rows of the container frame C (as described with reference to Fig. 7), in a preferred embodiment R2' and R3' are positioned in such a way that a guard strip 32 pixel rows high is left between the bottom edge of L and the top edge of R2' and R3'. This provides a second guard strip, 8 pixel rows high, between the bottom edge of R2' and R3' and the bottom edge of C. By further exploiting the width of the container frame, it is possible to position R2' and R3' in such a way that they are completely surrounded by pixels coming from neither the right image nor the left image.
Finally, in the area C5' at the lower right edge of frame C, the overlay depth map (DM') is inserted with a resolution of 160x90 pixels, obtained by subsampling the original depth map as previously described. In general, the overlay depth map may have any resolution, as long as it is contained within the free space of frame C. In order to better exploit the available space, the overlay depth map may undergo a rotation and/or disassembly step before being inserted into frame C.
In a further embodiment, which is described herein with reference to Figs. 8 and 9, module 104 extracts three sub-images R1", R2" and R3", whose total sum of pixels exceeds that of the disassembled image.
The region R1" corresponds to the region R1' of Fig. 6, whereas R2" and R3" include the area of the regions R2' and R3' plus an additional area (Ra2 and Ra3) which makes it possible to minimize the creation of errors during the image compression step.
The segment R1" is thus a region having a size of 640x720 pixels and occupying the first columns of the frame R to be disassembled.
The segment R3" occupies the last columns of the frame R to be disassembled, and borders on the central region R2". R3" includes, on its left side (the one bordering on R2"), a buffer strip Ra3 containing pixels in common with the region R2". In other words, the last columns of R2" and the first columns of R3" (which constitute the buffer strip Ra3) coincide.
Preferably, the size of the buffer strip Ra3 is chosen as a function of the type of compression to be subsequently applied to the container frame C and, in general, to the video stream containing it. In particular, said strip has a size which is twice that of the elementary processing unit used in the compression process. For example, the H.264 standard provides for disassembling the image into macroblocks of 16x16 pixels, each of which represents the elementary processing unit of this standard. Based on this assumption, the strip Ra3 has a width of 32 pixels. The segment R3" therefore has a size of 352 (320+32) x 720 pixels and comprises the pixels of the last 352 columns of the image R.
The segment R2" occupies the central part of the image R to be disassembled and includes, on its left side, the buffer strip Ra2 having the same size as the strip Ra3. In the example taking into account the H.264 compression standard, the strip Ra2 is thus 32 pixels wide and comprises pixels in common with the region R1". The segment R2" therefore has a size of 352x720 pixels and comprises the pixels of the columns from 608 (640 of R1" minus 32) to 959 of the frame R.
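The three overlapping segments can again be expressed as slices of the frame R; a sketch using 0-based column indices, consistent with the numbering above:

```python
import numpy as np

R = np.zeros((720, 1280), dtype=np.uint8)

R1b = R[:, :640]      # R1": columns 0-639 (640x720)
R2b = R[:, 608:960]   # R2": columns 608-959 (352x720); Ra2 = columns 608-639
R3b = R[:, 928:1280]  # R3": columns 928-1279 (352x720); Ra3 = columns 928-959
```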
The three sub-images belonging to the regions R1", R2" and R3" output by module 104 (visible in Fig. 8) are then inserted into the container frame C as shown in Fig. 9. The regions R2" and R3" are rotated by 90° and the pixels are copied into the last rows of frame C (the areas designated C3" and C4"), with a certain number of guard pixels separating the areas C3" and C4" from the areas C1 and C2 that include the pixels of the images L and R1". In the case shown in Fig. 9, this guard strip is 8 pixels wide.
In this embodiment as well, the overlay depth map (DM') is inserted into the area C5' at the lower right edge of frame C.
The frame C thus obtained is then compressed and transmitted or saved to a storage medium (e.g., a DVD). For this purpose, compression means are provided which are adapted to compress an image or a video signal, together with means for recording and/or transmitting the compressed image or video signal.
Fig. 10 shows a block diagram of a receiver 1100 that decompresses the container frame (if compressed) received or read from a medium, reconstructs the two right and left images, and makes them available, together with the related overlay depth map, to a display device (e.g., a television set), thus allowing 3D contents to be enjoyed with images superimposed on the video content. The receiver 1100 may be a set-top box or a receiver built into a television set. It should be noted that, when the receiver 1100 is a set-top box not integrated into a television set, it will internally use the depth map for the graphics it generates itself (e.g., subtitles, EPG and related menus). In addition, the receiver 1100 will have to send the depth map (e.g., via the HDMI interface) to the television set, since the latter will need it in order to properly position its own graphics (e.g., its menus).
The same remarks made for the receiver 1100 also apply to a reader (e.g., a DVD player) that reads a container frame (possibly compressed) and processes it in order to obtain a pair of frames corresponding to the right and left images inserted into the container frame read by the reader.
With reference again to Fig. 10, the receiver receives (via cable or antenna) a compressed stereoscopic video stream 1101 and decompresses it by means of a decompression module 1102, thus obtaining a video stream comprising a sequence of frames C' corresponding to the frames C. If an ideal channel exists, or if the container frames are being read from a mass memory or a data medium (Blu-ray, CD, DVD), the frames C' correspond to the frames C carrying the information about the right and left images and the overlay depth map, apart from any errors introduced by the compression process.
These C 'frames are then sent to a reconstruction module 1103, which performs image reconstruction and a depth map extraction method as described below with reference to Figs. 11 and 12.
It is apparent that if the video stream is not compressed, decompression module 1102 can be omitted and the video signal can be supplied directly to reconstruction module 1103.
The reconstruction process starts at step 1300, when the decompressed container frame C' is received. The reconstruction module 1103 extracts (step 1301) the left image L by copying the first 1280x720 pixels of the decompressed frame into a new frame that is smaller than the container frame, e.g., a frame of a 720p stream. The image L thus reconstructed is sent to the output of the receiver 1100 (step 1302). Subsequently, the method extracts the right image R from the container frame C'.
The step of extracting the right image begins by copying (step 1303) a portion of the area R1" included in the frame C'. More in detail, the pixels of the first 624 (640-16) columns of R1" are copied into the corresponding first 624 columns of the new frame representing the reconstructed image Rout, as shown in Fig. 11. In this way, the reconstruction step discards the 16 columns of R1" that are most subject to the creation of errors, e.g., through the effect of the motion estimation procedure carried out by the H.264 compression standard.
Then a central portion of R2" is extracted (step 1304). From the decompressed frame C' (which, as aforesaid, corresponds to the frame C of Fig. 9), the pixels of the area C3" (corresponding to the source region R2") are selected and a 90° rotation inverse to the one performed in the multiplexer 100 is carried out, which returns them to the original row/column condition, that is, the one shown in Fig. 8. At that point, the first and last sixteen (16) columns of R2" are eliminated and the remaining 352-32=320 pixel columns are copied into the free columns adjacent to those just copied from R1".
By cutting the 16 outermost columns of the region R2", those columns are eliminated where the formation of errors is most likely. The width of the cut area (in this case, 16 columns) depends on the type of compression used. Said area is preferably equal to the elementary processing unit used by the compression process; in the case described herein, the H.264 standard operates on blocks of 16x16 pixels, and therefore 16 columns are to be cut.
As for R3" (step 1305), the pixels of the area C4" are extracted from the frame C' and the sub-image R3" is brought back to its original row/column format (see Fig. 8). Subsequently, the first 16 pixel columns are eliminated (corresponding to half of the area Ra3) and the remaining 352-16=336 pixel columns are copied into the last free columns of the reconstructed frame. As with R2", also for R3" the cut area is equal to the elementary processing unit used by the compression process.
Of course, for both regions R2" and R3" the rotation step may be carried out in a virtual manner, that is, the same result in terms of extraction of the pixels of interest may be obtained by copying the pixels of a row of the area C3" (C4" for R3") into a column of the new frame Rout, except for the last 16 rows of the area C3" (C4" for R3"), corresponding to the sixteen columns to be cut, shown in Fig. 8.
At this point, the right image Rout has been fully reconstructed and can be output (step 1306).
Finally, the reconstruction module 1103 extracts (step 1308) the overlay depth map DM' by copying into a register the luminance values of the last 160x90 pixels of the decompressed container frame C', corresponding to the area C5'. The content of that register is sent to the output of the receiver 1100 (step 1309) and will be used for defining the depth position of the images (text or graphics) to be combined with the three-dimensional content carried by the stereoscopic video stream; in particular, it will be used for combining the images to be superimposed on the three-dimensional content.
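Steps 1301 to 1308 can be summarized in a single demultiplexing sketch for the layout of Fig. 9. The exact area offsets (e.g., the 8-row guard strip placing C3" and C4" at row 728, their side-by-side placement) and the rotation direction are assumptions where the description leaves them open:

```python
import numpy as np

def reconstruct(C):
    """Rebuild L, Rout and DM' from a container frame laid out as in Fig. 9."""
    L = C[:720, :1280].copy()                  # step 1301: left image

    Rout = np.zeros((720, 1280), dtype=C.dtype)
    R1b = C[:720, 1280:]                       # area C2
    Rout[:, :624] = R1b[:, :624]               # step 1303: drop the last 16 columns

    R2b = np.rot90(C[728:1080, :720], -1)      # area C3": undo the 90-degree rotation
    Rout[:, 624:944] = R2b[:, 16:336]          # step 1304: cut 16 columns per side

    R3b = np.rot90(C[728:1080, 720:1440], -1)  # area C4"
    Rout[:, 944:1280] = R3b[:, 16:]            # step 1305: cut the first 16 columns

    DM = C[-90:, -160:].copy()                 # step 1308: overlay depth map (C5')
    return L, Rout, DM
```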
As an alternative or in addition to outputting the content of the depth map and the L and R images extracted from the input frames, the receiver 1100 comprises a character generator and/or a graphics generator and combines further images with the L and R images, that is, with the three-dimensional content. The images to be combined are selected from a memory area of the receiver and may have been stored when the receiver was manufactured (e.g., the graphics of some menus or the channel numbers), or they may be extracted from the video stream (e.g., programme guide information and subtitles).
These images are combined with the three-dimensional content at depth positions depending on the overlay depth map extracted from the video stream. In particular, for each stereoscopic image (produced by the pair of L and R images), the combined image is placed at the point of minimum depth of the stereoscopic image.
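As a sketch of this positioning logic: the receiver reads the minimum value of the overlay depth map and turns it into a horizontal parallax applied with opposite signs to the two views. The mapping from depth value to pixel disparity is an assumed linear one, since the description leaves it to the implementation; bounds checks are omitted for brevity:

```python
def superimpose(L, R, overlay, x, y, dm, max_disp=20):
    """Draw a 2D overlay patch on both views at the minimum depth of dm."""
    # Smaller map values are assumed to mean "closer to the viewer".
    d = int(round((1 - dm.min() / 255) * max_disp))
    h, w = overlay.shape
    Ls, Rs = L.copy(), R.copy()
    Ls[y:y + h, x + d:x + d + w] = overlay  # shifted right in the left view
    Rs[y:y + h, x - d:x - d + w] = overlay  # shifted left in the right view
    return Ls, Rs
```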
After the images have been combined with the 3D content, in this embodiment the receiver 1100 outputs a pair of images L* and R* which, when reproduced, will be perceived by the user as a three-dimensional content corresponding to the original one (produced by the images L and R) with images superimposed on it, e.g., subtitles, menus, graphics, etc.
The process of reconstructing the right and left images and the depth map contained in the container frame C' is thus completed (step 1307). This process is repeated for each frame of the video stream received by the receiver 1100, so that the output consists of two video streams 1104 and 1105 for the right image and for the left image, respectively, and of one data signal derived from the overlay depth map.
The process of reconstructing the right and left images and the overlay depth map described above with reference to Figs. 10, 11 and 12 is based on the assumption that the demultiplexer 1100 knows how the container frame C was built and can thus extract the right and left images and the overlay depth map.
Of course, this is possible if the multiplexing method is standardized.
In order to take into account the fact that the container frame may be generated according to any one of the above-described methods, or at any rate according to any one of the methods utilizing the solution which is the subject of the appended claims, the demultiplexer uses the marking information contained in a predefined region of the composite image (e.g., a barcode, as previously described) in order to know how the contents of the composite image must be unpacked and how to reconstruct the right and left images and the overlay depth map.
After decoding the marking, the demultiplexer will know the position of the unchanged image (e.g., the left image in the above-described examples), as well as the positions and any transformations (rotation, translation or the like) of the regions into which the other image (e.g., the right image in the above-described examples) was disassembled, and the position of the overlay depth map.
With this information, the demultiplexer can thus extract the unchanged image (e.g., the left image) and the depth map, and reconstruct the disassembled image (e.g., the right image).
Although the present invention has so far been illustrated with reference to some preferred and advantageous embodiments, it is clear that it is not limited to such embodiments, and many changes may be made thereto by a person skilled in the art wanting to combine into a composite image two images relating to two different perspectives (right and left) of an object or a scene.
For example, the electronic modules providing the above-described devices, in particular the device 100 and the receiver 1100, may be variously subdivided and distributed; furthermore, they may be provided in the form of hardware modules or as software algorithms implemented by a processor, in particular a video processor equipped with suitable memory areas for temporarily storing the incoming frames received. These modules may therefore execute in parallel or in series one or more of the video processing steps of the image multiplexing and demultiplexing methods according to the present invention.
It is also apparent that, although preferred embodiments relate to the multiplexing of two 720p video streams into one 1080p video stream, other formats can also be used, such as two 640x480 video streams in a 1280x720 video stream, or two 320x200 video streams in one 640x480 video stream.
Nor is the invention limited to a certain type of arrangement of the composite image, since different solutions for the generation of the composite image can offer specific advantages.
For example, the embodiments described above with reference to Figs. 1 to 12 offer the advantage that they only perform translation or rototranslation operations, thus requiring only small computational power.
Alternatively, it is conceivable that the images are also subjected to specular inversion steps, in addition to said rotation and/or translation operations, in order to obtain a composite image of the type shown in Fig. 13.
These additional operations are performed for the purpose of maximizing the border perimeter between regions containing homologous pixels, thus exploiting the strong correlation existing between them and minimizing the errors introduced by the subsequent compression step. In the example of Figs. 13 and 14 it was assumed, for clarity, that the two right and left images are identical, even though they generally differ slightly.
In this figure, the left image L (shown in Fig. 14a) is positioned on the upper right edge of the container frame C, in order to occupy the last 1280 pixels of the first 720 rows. As in the examples described above, image L is thus copied unchanged in container frame C.
Instead, the right image R is disassembled according to the example in Fig. 3; Fig. 14b shows the image R divided into three regions R1, R2 and R3.
Subsequently, some regions (the regions R1 and R3 in the example of Fig. 14) undergo a specular inversion operation; the inversion may occur with respect to a vertical axis (that is, parallel to a column of the image) or to a horizontal axis (that is, parallel to a row of the image).
In the case of inversion with respect to a vertical axis, the pixels of the column N (where N is an integer between 1 and 1280, 1280 being the number of columns of the image) are copied into the column 1280+1-N.
In the case of inversion with respect to a horizontal axis, the pixels of the row M (where M is an integer between 1 and 720, 720 being the number of rows of the image) are copied into the row 720+1-M.
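In 0-based array terms, both inversions are plain index reversals (for a region of width W, column N goes to column W-1-N; for height H, row M goes to row H-1-M). A minimal sketch reusing the regions of the earlier disassembly:

```python
import numpy as np

R = np.zeros((720, 1280), dtype=np.uint8)
R1, R3 = R[:, :640], R[360:, 640:]

R1inv = R1[:, ::-1]  # inversion about a vertical axis: reversed column order
R3inv = R3[::-1, :]  # inversion about a horizontal axis: reversed row order
```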
Figs. 14c and 14d show the region R1 extracted from the image R and inverted (R1inv) with respect to a vertical axis, in particular with respect to a vertical side.
The inverted region R1inv is inserted into the first 640 pixels of the first 720 rows of pixels.
As can be seen in the example of Fig. 13, when R1inv is inserted mirrored into the container frame C, the pixels of R1inv bordering on L are very similar to the pixels of L bordering on R1inv. The spatial correlation between these pixels has the advantage of reducing the formation of errors.
Figs. 14e and 14f show the region R3 extracted from the image R of Fig. 14b and then inverted (R3inv) with respect to a horizontal axis, in particular with respect to a horizontal side.
The region R3inv is inserted into the last 640 pixels of the last 360 rows. This reduces the generation of errors, since the pixels of the border regions between R3inv and L are pixels having a high spatial correlation. The pixels of this border region, in fact, reproduce similar or identical parts of the image.
The container frame C is then completed by inserting the R2 region.
In this example, R2 is not inverted and/or rotated because it would not be possible, in either case, to match a border region of R2 with a border region made up of homologous pixels of another region of R or L.
Finally, it is also apparent that the invention relates to any demultiplexing method which allows a right image and a left image to be extracted from a composite image by reversing one of the above-described multiplexing processes falling within the scope of protection of the present invention.
The invention therefore also refers to a method for generating a pair of images starting from a composite image, which comprises the steps of:
- generating a first image (e.g., the left image) of said right and left images by copying a single group of contiguous pixels from a region of said composite image;
- generating a second image (e.g., the right image) by copying other groups of contiguous pixels from different regions of said composite image.
According to one embodiment, the information for generating said second image is extracted from an area of said composite image. Said information is preferably encoded according to a barcode.
In one embodiment of the method for generating the right and left images, the generation of the image that was disassembled in the composite image comprises at least one step of specular inversion of a group of pixels of one of said different regions.
In one embodiment of the method for generating the right and left images, the generation of the image that was disassembled in the composite image comprises at least one step of removing pixels from one of the regions of the composite image comprising the pixels of the image to be reconstructed. In particular, the pixels are removed from a border area of that region.
In one embodiment, the image that was disassembled into different regions of the composite image is reconstructed by subjecting the regions of pixels that include the pixels of the disassembled image to translation and/or rotation operations only.
Although the above-described embodiment examples refer to the insertion of an overlay depth map into a container frame in which one of the two right and left images is disassembled into several parts, it is clear that the invention does not depend on the way in which the two right and left images are formatted within the container frame. For example, the two images may be subsampled and arranged side by side (side-by-side format) or one on top of the other (top-and-bottom format) so as to leave a free space in the frame where the overlay depth map can be placed. Alternatively, one of the right and left images may be left unchanged, whereas the other may be subsampled in order to make room for the depth map.
Finally, it should be noted that the embodiment examples described above with reference to the annexed drawings refer to a “total” depth map, that is, a depth map computed by decimating or filtering a depth map of the 3D content without, however, subdividing it into several parts, unlike one of the two images L and R, for example. However, this is not a limitation of the present invention, and the overlay depth map, once generated (or received), may be inserted into the container frame by an encoder that divides it into multiple parts to be arranged in different regions of the container frame. For example, as is known, in order to encode a stereoscopic content an H.264 encoder must insert eight additional rows which will be cut by the decoder; in one embodiment, the overlay depth map can be inserted into these eight additional rows by dividing it, for example, into 240 blocks of 8x8 size, which, when properly reassembled, will form an image with dimensions proportional to those of the stereoscopic content carried. An example of a block arrangement can be obtained by scanning the rows of a depth map decimated by a factor of 16, therefore having a resolution of 120x72, wherein bands of 120x8 pixels are aligned so as to obtain a 1080x8-pixel image. In another embodiment, the same decimated depth map can be subdivided into a larger number of strips 8 pixels high by using a 6-pixel offset instead of an 8-pixel one, so that the content becomes redundant and is protected at the boundary with the main image. This appears to be particularly advantageous whenever the stereoscopic content consists of a pair of right and left images multiplexed in a top-and-bottom, side-by-side or checkerboard format with a resolution occupying all the potentially displayable pixels of the frame, for example the pixels of a 1920x1080 format.
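For the 120x72 example, the band arrangement can be sketched as follows; this is only one possible arrangement consistent with the description (the nine 8-row bands of the decimated map laid side by side give a 1080x8-pixel strip, the remaining 840x8 pixels of the eight extra rows staying free):

```python
import numpy as np

def pack_into_extra_rows(dm):
    """Pack a 72x120 (rows x columns) decimated depth map into an 8x1080 strip."""
    bands = [dm[i:i + 8, :] for i in range(0, 72, 8)]  # nine 8x120 bands
    return np.hstack(bands)                            # 8x1080 strip

def unpack_from_extra_rows(strip):
    """Inverse operation at the decoder side."""
    return np.vstack([strip[:, i:i + 120] for i in range(0, 1080, 120)])
```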
Preferably, in the event that the frame includes a pair of asymmetrically decimated images (e.g., a side-by-side format in which the columns are decimated more than the rows, or a top-and-bottom format in which only the rows are decimated, and not the columns), then the overlay depth map is obtained by decimating a depth map with a row/column decimation ratio proportional to that used for decimating the images placed in the same frame. As an example, assuming that a side-by-side format is used for multiplexing the left and right images in the frame, the row/column decimation ratio will be 1:2, since all the rows are kept and the columns are halved. In this case, the overlay depth map can be obtained by decimating a depth map with a row/column decimation ratio of 1:2.
It is also clear that methods other than those described above may be used for signalling to the receiver the area occupied by the depth map; the above-described methods provide for the insertion of a marking into the image, but such a marking may also be included in a data packet of the signal carrying the video stream.
Claims:
1. METHOD for superimposing images on a three-dimensional content, wherein a video stream is received which comprises said three-dimensional content and a depth map (DM, DM') for superimposing images on said three-dimensional content, said depth map (DM, DM') containing information about the depth of said three-dimensional content and being inserted as an image into a frame (C) of said video stream, the method being characterized in that said depth map (DM, DM') is transmitted for the sole purpose of allowing, in the playback phase, the superimposition of images on said three-dimensional content, said superimposition being carried out at a depth position dependent on said overlay depth map (DM, DM'), and in that said depth map (DM, DM') has a smaller number of pixels than that of a two-dimensional image associated with said three-dimensional content.
2. METHOD according to claim 1, characterized in that said overlay depth map only contains information about the depth of the pixels located in the lower half, preferably in the lower third, of said three-dimensional content.
3. METHOD according to claim 1 or 2, characterized in that said overlay depth map has a non-uniform resolution; in particular, the lower half or the lower third of said depth map has a higher resolution than the upper part.
4. METHOD according to claim 1, 2 or 3, characterized in that said overlay depth map has a lower resolution than that of a two-dimensional image associated with said three-dimensional content.
5. METHOD according to claim 4, characterized in that said three-dimensional content is an image consisting of a plurality of pixels, and in that said depth map is obtained by subsampling a depth map whose elements correspond to the depth of the pixels of said three-dimensional content.
6. METHOD according to claim 5, characterized in that, after the subsampling of said depth map, the subsampled map is divided into blocks and each pixel of a block receives the same value, equal to the minimum depth of the pixels of said block or to the average depth value of the pixels of the block.
7. METHOD according to claim 5, characterized in that, before the subsampling of said depth map, the depth map is divided into blocks and each pixel of a block receives the same value, equal to the minimum depth of the pixels of said block or to the average depth value of the pixels of the block.
8. METHOD according to claim 6 or 7, characterized in that said blocks have a size equal to a multiple of an elementary block of 2x2 pixels.
9. METHOD according to any one of the preceding claims, characterized in that said overlay depth map is inserted into a part of said frame not intended for display.
10. METHOD according to any one of the preceding claims, characterized in that said depth map is divided into blocks distributed in the areas of said frame (C) that are not occupied by said three-dimensional content.
11. METHOD according to any one of the preceding claims, characterized in that said frame comprises a right image, a left image and said depth map, wherein said depth map is divided into blocks distributed in regions of the frame (C) that are not occupied by said three-dimensional content, and wherein said frame (C) is encoded according to the H.264 coding standard.
12. METHOD according to any one of claims 1 to 9, characterized in that said three-dimensional content comprises a two-dimensional image and information allowing the reconstruction of the other image of a stereoscopic pair, and in that said overlay depth map is inserted into a part of the two-dimensional image.
13. METHOD according to any one of claims 1 to 12, characterized in that said frame comprises a marking adapted to indicate to the receiver the position of said overlay depth map within said frame.
14. METHOD according to any one of claims 1 to 13, characterized in that said video stream comprises a marking adapted to indicate to the receiver the position of said overlay depth map within said frame, said marking being external to said frame.
15. DEVICE for reproducing three-dimensional content, comprising means adapted to receive a video stream containing three-dimensional content and means adapted to combine an image with said three-dimensional content, characterized in that said means adapted to combine an image with said three-dimensional content are adapted to implement a method according to any one of claims 1 to 14.
16. STEREOSCOPIC VIDEO STREAM (1101), comprising a plurality of frames, characterized in that it comprises at least one three-dimensional content and at least one overlay depth map (DM, DM') inserted as an image within a frame of said plurality of frames, said overlay depth map (DM, DM') comprising a smaller number of pixels than that of a two-dimensional image associated with said three-dimensional content, said stereoscopic video stream being adapted to be used in a method according to any one of claims 1 to 14.
Similar technologies:
Publication number | Publication date | Patent title
BR112013001910A2|2019-09-17|method for combining images referring to three-dimensional content
BR112012015261B1|2021-07-06|method for generating, transmitting and receiving stereoscopic images, and related devices
KR101676504B1|2016-11-15|Demultiplexing for stereoplexed film and video applications
JP6644979B2|2020-02-12|Method and device for generating, storing, transmitting, receiving and playing back a depth map by using color components of an image belonging to a three-dimensional video stream
CN103202021B|2017-06-13|Code device, decoding apparatus, transcriber, coding method and coding/decoding method
US9571811B2|2017-02-14|Method and device for multiplexing and demultiplexing composite images relating to a three-dimensional content
ES2690492T3|2018-11-21|Procedure for the generation, transmission and reception of stereoscopic images, and related devices
JP2014517606A|2014-07-17|Method for generating, transmitting and receiving stereoscopic images, and related apparatus
ES2446165A2|2014-03-06|Method for generating, transmitting and receiving stereoscopic images and relating devices
Patent family:
Publication number | Publication date
WO2012014171A1|2012-02-02|
EA201390178A1|2013-07-30|
HUE027682T2|2016-11-28|
WO2012014171A9|2012-05-24|
CN103329543A|2013-09-25|
JP2013539256A|2013-10-17|
ES2564937T3|2016-03-30|
KR20130052621A|2013-05-22|
TW201223248A|2012-06-01|
CN103329543B|2016-06-29|
IT1401367B1|2013-07-18|
US20130135435A1|2013-05-30|
US9549163B2|2017-01-17|
EP2599319A1|2013-06-05|
ITTO20100652A1|2012-01-29|
ZA201300610B|2014-03-26|
KR101840308B1|2018-03-20|
EP2599319B1|2015-12-16|
PL2599319T3|2016-06-30|
Cited documents:
Publication number | Filing date | Publication date | Applicant | Patent title

US4467421A|1979-10-18|1984-08-21|Storage Technology Corporation|Virtual storage system and method|
US4434437A|1981-01-26|1984-02-28|Rca Corporation|Generating angular coordinate of raster scan of polar-coordinate addressed memory|
US5848198A|1993-10-08|1998-12-08|Penn; Alan Irvin|Method of and apparatus for analyzing images and deriving binary image representations|
US5691768A|1995-07-07|1997-11-25|Lucent Technologies, Inc.|Multiple resolution, multi-stream video system using a single standard decoder|
US5870097A|1995-08-04|1999-02-09|Microsoft Corporation|Method and system for improving shadowing in a graphics rendering system|
AUPQ416699A0|1999-11-19|1999-12-16|Dynamic Digital Depth Research Pty Ltd|Depth map compression technique|
US20030198290A1|2002-04-19|2003-10-23|Dynamic Digital Depth Pty.Ltd.|Image encoding system|
AU2002952873A0|2002-11-25|2002-12-12|Dynamic Digital Depth Research Pty Ltd|Image encoding system|
US20050041736A1|2003-05-07|2005-02-24|Bernie Butler-Smith|Stereoscopic television signal processing method, transmission system and viewer enhancements|
KR100528343B1|2003-07-14|2005-11-15|Samsung Electronics Co., Ltd.|Method and apparatus for image-based rendering and editing of 3D objects|
WO2005067319A1|2003-12-25|2005-07-21|Brother Kogyo Kabushiki Kaisha|Image display device and signal processing device|
US8384763B2|2005-07-26|2013-02-26|Her Majesty the Queen in right of Canada as represented by the Minister of Industry, Through the Communications Research Centre Canada|Generating a depth map from a two-dimensional source image for stereoscopic and multiview imaging|
US20100091012A1|2006-09-28|2010-04-15|Koninklijke Philips Electronics N.V.|3D menu display|
EP2157803B1|2007-03-16|2015-02-25|Thomson Licensing|System and method for combining text with three-dimensional content|
WO2009011492A1|2007-07-13|2009-01-22|Samsung Electronics Co., Ltd.|Method and apparatus for encoding and decoding stereoscopic image format including both information of base view image and information of additional view image|
WO2009083863A1|2007-12-20|2009-07-09|Koninklijke Philips Electronics N.V.|Playback and overlay of 3d graphics onto 3d video|
KR101539935B1|2008-06-24|2015-07-28|Samsung Electronics Co., Ltd.|Method and apparatus for processing 3D video image|
CN106101682B|2008-07-24|2019-02-22|Koninklijke Philips Electronics N.V.|Versatile 3-D picture format|
EP2184713A1|2008-11-04|2010-05-12|Koninklijke Philips Electronics N.V.|Method and device for generating a depth map|
US9013551B2|2008-12-01|2015-04-21|Imax Corporation|Methods and systems for presenting three-dimensional motion pictures with content adaptive information|
EP2197217A1|2008-12-15|2010-06-16|Koninklijke Philips Electronics N.V.|Image based 3D video format|
EP2373041A4|2008-12-26|2015-05-20|Panasonic Ip Man Co Ltd|Recording medium, reproduction device, and integrated circuit|
EP2389765B1|2009-01-20|2016-01-13|Koninklijke Philips N.V.|Transferring of 3d image data|
US8269821B2|2009-01-27|2012-09-18|EchoStar Technologies, L.L.C.|Systems and methods for providing closed captioning in three-dimensional imagery|
US9438879B2|2009-02-17|2016-09-06|Koninklijke Philips N.V.|Combining 3D image and graphical data|
WO2010131314A1|2009-05-14|2010-11-18|Panasonic Corporation|Method for transmitting video data|
WO2010150976A2|2009-06-23|2010-12-29|Lg Electronics Inc.|Receiving system and method of providing 3d image|
WO2010151555A1|2009-06-24|2010-12-29|Dolby Laboratories Licensing Corporation|Method for embedding subtitles and/or graphic overlays in a 3d or multi-view video data|
US9648346B2|2009-06-25|2017-05-09|Microsoft Technology Licensing, Llc|Multi-view video compression and streaming based on viewpoints of remote viewer|
JP5446913B2|2009-06-29|2014-03-19|Sony Corporation|Stereoscopic image data transmitting apparatus and stereoscopic image data transmitting method|
JP5369952B2|2009-07-10|2013-12-18|Sony Corporation|Information processing apparatus and information processing method|
US20110032332A1|2009-08-07|2011-02-10|Darren Neuman|Method and system for multiple progressive 3d video format conversion|
KR101621528B1|2009-09-28|2016-05-17|Samsung Electronics Co., Ltd.|Display apparatus and display method of 3-dimensional video signal thereof|
JP5372687B2|2009-09-30|2013-12-18|Sony Corporation|Transmitting apparatus, transmitting method, receiving apparatus, and receiving method|
WO2011063397A1|2009-11-23|2011-05-26|General Instrument Corporation|Depth coding as an additional channel to video sequence|
US9014276B2|2009-12-04|2015-04-21|Broadcom Corporation|Method and system for 3D video coding using SVC temporal and spatial scalabilities|
US20110149032A1|2009-12-17|2011-06-23|Silicon Image, Inc.|Transmission and handling of three-dimensional video content|
US10462414B2|2009-12-31|2019-10-29|Cable Television Laboratories, Inc.|Method and system for generation of captions over stereoscopic 3D images|
KR20120120502A|2010-01-21|2012-11-01|제너럴 인스트루먼트 코포레이션|Stereoscopic video graphics overlay|
US9398289B2|2010-02-09|2016-07-19|Samsung Electronics Co., Ltd.|Method and apparatus for converting an overlay area into a 3D image|
US20110273437A1|2010-05-04|2011-11-10|Dynamic Digital Depth Research Pty Ltd|Data Dependent Method of Configuring Stereoscopic Rendering Parameters|
US8842170B2|2010-06-01|2014-09-23|Intel Corporation|Method and apparaus for making intelligent use of active space in frame packing format|
US8718356B2|2010-08-23|2014-05-06|Texas Instruments Incorporated|Method and apparatus for 2D to 3D conversion using scene classification and face detection|
US9094660B2|2010-11-11|2015-07-28|Georgia Tech Research Corporation|Hierarchical hole-filling for depth-based view synthesis in FTV and 3D video|
TWI630815B|2012-06-14|2018-07-21|Dolby Laboratories Licensing Corporation|Depth map delivery formats for stereoscopic and auto-stereoscopic displays|
TWI508523B|2012-06-28|2015-11-11|Chunghwa Picture Tubes Ltd|Method for processing three-dimensional images|
US9674501B2|2012-07-06|2017-06-06|Lg Electronics Inc.|Terminal for increasing visual comfort sensation of 3D object and control method thereof|
WO2014025294A1|2012-08-08|2014-02-13|Telefonaktiebolaget L M Ericsson |Processing of texture and depth images|
RU2012138174A|2012-09-06|2014-03-27|Sisvel Technology S.R.L.|3DZ TILE FORMAT DIGITAL STEREOSCOPIC VIDEO FLOW FORMAT METHOD|
TWI466062B|2012-10-04|2014-12-21|Ind Tech Res Inst|Method and apparatus for reconstructing three dimensional model|
US20140300702A1|2013-03-15|2014-10-09|Tagir Saydkhuzhin|Systems and Methods for 3D Photorealistic Automated Modeling|
TWI558166B|2013-04-04|2016-11-11|Dolby International AB|Depth map delivery formats for multi-view auto-stereoscopic displays|
TWI512678B|2013-10-02|2015-12-11|Univ Nat Cheng Kung|Non-transitory storage medium|
TWI602145B|2013-10-02|2017-10-11|National Cheng Kung University|Unpacking method, device and system of packed frame|
KR101679122B1|2013-10-02|2016-11-23|National Cheng Kung University|Method, device and system for packing and unpacking color frame and original depth frame|
KR101652583B1|2013-10-02|2016-08-30|National Cheng Kung University|Method, device and system for resizing and restoring original depth frame|
TWI602144B|2013-10-02|2017-10-11|National Cheng Kung University|Method, device and system for packing color frame and original depth frame|
TWI503788B|2013-10-02|2015-10-11|Jar Ferr Yang|Method, device and system for restoring resized depth frame into original depth frame|
TWI603290B|2013-10-02|2017-10-21|National Cheng Kung University|Method, device and system for resizing original depth frame into resized depth frame|
CN104567758B|2013-10-29|2017-11-17|Nuctech Company Limited|Stereo imaging system and its method|
GB201407643D0|2014-04-30|2014-06-11|Tomtom Global Content Bv|Improved positioning relatie to a digital map for assisted and automated driving operations|
JP6391423B2|2014-10-24|2018-09-19|Sony Interactive Entertainment Inc.|Image generating apparatus, image extracting apparatus, image generating method, and image extracting method|
KR101626679B1|2014-11-26|2016-06-01|Kangwon National University Industry-Academic Cooperation Foundation|Method for generating stereoscopic image from 2D image and for medium recording the same|
CA2992304A1|2015-07-15|2017-01-19|Blinxel Pty Ltd|System and method for image processing|
CN107850449B|2015-08-03|2021-09-03|TomTom Global Content B.V.|Method and system for generating and using positioning reference data|
US9894342B2|2015-11-25|2018-02-13|Red Hat Israel, Ltd.|Flicker-free remoting support for server-rendered stereoscopic imaging|
EP3391647A4|2015-12-18|2019-12-11|Boe Technology Group Co. Ltd.|Method, apparatus, and non-transitory computer readable medium for generating depth maps|
CN105898274B|2016-04-13|2018-01-12|Wanyun Digital Media Co., Ltd.|2D-plus-depth 3D image longitudinal storage method based on RGB compression|
US10362241B2|2016-12-30|2019-07-23|Microsoft Technology Licensing, Llc|Video stream delimiter for combined frame|
TWI658429B|2017-09-07|2019-05-01|Acer Incorporated|Image combination method and image combination system|
WO2020181104A1|2019-03-07|2020-09-10|Alibaba Group Holding Limited|Method, apparatus, medium, and server for generating multi-angle free-perspective video data|
CN111669564A|2019-03-07|2020-09-15|Alibaba Group Holding Limited|Image reconstruction method, system, device and computer readable storage medium|
CN109842811B|2019-04-03|2021-01-19|Tencent Technology (Shenzhen) Co., Ltd.|Method and device for implanting push information into video and electronic equipment|
Legal status:
2020-01-14| B08K| Patent lapsed as no evidence of payment of the annual fee has been furnished to INPI [chapter 8.11 patent gazette]|Free format text: IN VIEW OF THE SHELVING PUBLISHED IN RPI 2542 OF 24-09-2019, AND CONSIDERING THE ABSENCE OF ANY RESPONSE WITHIN THE LEGAL TIME LIMITS, IT IS HEREBY STATED THAT THE SHELVING OF THE PATENT APPLICATION IS TO BE MAINTAINED, AS PROVIDED FOR IN ARTICLE 12 OF RESOLUTION 113/2013. |
2021-10-05| B350| Update of information on the portal [chapter 15.35 patent gazette]|
Priority:
Application number | Filing date | Patent title
ITTO2010A000652A|IT1401367B1|2010-07-28|2010-07-28|METHOD FOR COMBINING IMAGES REFERRING TO A THREE-DIMENSIONAL CONTENT.|
PCT/IB2011/053361|WO2012014171A1|2010-07-28|2011-07-28|Method for combining images relating to a three-dimensional content|